Out-of-Distribution Detection from Small Training Sets using Bayesian Neural Network Classifiers
Out-of-Distribution (OOD) detection is critical to AI reliability and safety, yet in many practical settings, only a limited amount of training data is available. Bayesian Neural Networks (BNNs) are a promising class of model on which to base OOD detection, because they explicitly represent epistemic (i.e. model) uncertainty. In the small training data regime, BNNs are especially valuable because they can incorporate prior model information. We introduce a new family of Bayesian posthoc OOD scores based on expected logit vectors, and compare 5 Bayesian and 4 deterministic posthoc OOD scores. Experiments on MNIST and CIFAR-10 In-Distributions, with 5000 training samples or fewer, show that the Bayesian methods outperform corresponding deterministic methods.
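The abstract does not define the individual scores, so the following is only a minimal sketch of what a score "based on expected logit vectors" could look like. It assumes that logit vectors from several posterior samples of the network are already available, averages them, and then applies a standard max-softmax rule to the averaged vector. The function names and the choice of max-softmax are illustrative assumptions, not the authors' definitions.

```python
import math


def expected_logits(logit_samples):
    """Average the logit vectors produced by several posterior samples.

    logit_samples: list of logit vectors (one per posterior sample),
    all of the same length. This input format is an assumption.
    """
    num_samples = len(logit_samples)
    num_classes = len(logit_samples[0])
    return [
        sum(sample[c] for sample in logit_samples) / num_samples
        for c in range(num_classes)
    ]


def max_softmax_probability(logits):
    """Largest softmax probability, computed with the usual max shift."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    return max(exps) / sum(exps)


def expected_logit_ood_score(logit_samples):
    """Hypothetical score: higher means 'more likely out-of-distribution'."""
    return 1.0 - max_softmax_probability(expected_logits(logit_samples))


# Posterior samples that agree on class 0 versus samples that disagree.
agreeing = [[5.0, 0.0, 0.0], [6.0, 0.0, 0.0]]
disagreeing = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0]]
score_agreeing = expected_logit_ood_score(agreeing)
score_disagreeing = expected_logit_ood_score(disagreeing)
```

In this toy input, disagreement between posterior samples flattens the averaged logit vector, so the second input receives the higher score.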
Personalized Control for Lower Limb Prosthesis Using Kolmogorov-Arnold Networks
Mohasel, SeyedMojtaba, Aghaei, Alireza Afzal, Pew, Corey
Objective: This paper investigates the potential of learnable activation functions in Kolmogorov-Arnold Networks (KANs) for personalized control in a lower-limb prosthesis. In addition, user-specific vs. pooled training data is evaluated to improve machine learning (ML) and Deep Learning (DL) performance for turn intent prediction. Method: Inertial measurement unit (IMU) data from the shank were collected from five individuals with lower-limb amputation performing turning tasks in a laboratory setting. Ability to classify an upcoming turn was evaluated for Multilayer Perceptron (MLP), Kolmogorov-Arnold Network (KAN), convolutional neural network (CNN), and fractional Kolmogorov-Arnold Networks (FKAN). The comparison of MLP and KAN (for ML models) and FKAN and CNN (for DL models) assessed the effectiveness of learnable activation functions. Models were trained separately on user-specific and pooled data to evaluate the impact of training data on their performance. Results: Learnable activation functions in KAN and FKAN did not yield significant improvement compared to MLP and CNN, respectively. Training on user-specific data yielded superior results compared to pooled data for ML models ($p < 0.05$). In contrast, no significant difference was observed between user-specific and pooled training for DL models. Significance: These findings suggest that learnable activation functions may demonstrate distinct advantages in datasets involving more complex tasks and larger volumes. In addition, pooled training showed comparable performance to user-specific training in DL models, indicating that model training for prosthesis control can utilize data from multiple participants.
SIM-Shapley: A Stable and Computationally Efficient Approach to Shapley Value Approximation
Fan, Wangxuan, Li, Siqi, Zhou, Doudou, Okada, Yohei, Hong, Chuan, Liu, Molei, Liu, Nan
Explainable artificial intelligence (XAI) is essential for trustworthy machine learning (ML), particularly in high-stakes domains such as healthcare and finance. Shapley value (SV) methods provide a principled framework for feature attribution in complex models but incur high computational costs, limiting their scalability in high-dimensional settings. We propose Stochastic Iterative Momentum for Shapley Value Approximation (SIM-Shapley), a stable and efficient SV approximation method inspired by stochastic optimization. We analyze variance theoretically, prove linear $Q$-convergence, and demonstrate improved empirical stability and low bias in practice on real-world datasets. In our numerical experiments, SIM-Shapley reduces computation time by up to 85% relative to state-of-the-art baselines while maintaining comparable feature attribution quality. Beyond feature attribution, our stochastic mini-batch iterative framework extends naturally to a broader class of sample average approximation problems, offering a new avenue for improving computational efficiency with stability guarantees. Code is publicly available at https://github.com/nliulab/SIM-Shapley.
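For readers unfamiliar with why Shapley values are expensive, the sketch below computes them exactly for a tiny cooperative game and then with the classic permutation-sampling estimator. This is not SIM-Shapley; it is the textbook baseline that approximation methods aim to improve on. The toy value function and all names are hypothetical.

```python
import itertools
import random


def marginal_contributions(value, order, totals):
    """Add each player's marginal contribution along one ordering to totals."""
    coalition = set()
    previous = value(frozenset(coalition))
    for player in order:
        coalition.add(player)
        current = value(frozenset(coalition))
        totals[player] += current - previous
        previous = current


def exact_shapley(value, num_players):
    """Exact Shapley values: average over all num_players! orderings."""
    totals = [0.0] * num_players
    orderings = list(itertools.permutations(range(num_players)))
    for order in orderings:
        marginal_contributions(value, order, totals)
    return [t / len(orderings) for t in totals]


def sampled_shapley(value, num_players, num_samples, seed=0):
    """Monte Carlo estimate: average over randomly drawn orderings."""
    rng = random.Random(seed)
    totals = [0.0] * num_players
    order = list(range(num_players))
    for _ in range(num_samples):
        rng.shuffle(order)
        marginal_contributions(value, order, totals)
    return [t / num_samples for t in totals]


def toy_value(coalition):
    """Hypothetical game: additive weights plus a bonus when 0 and 1 cooperate."""
    weights = [1, 2, 3]
    bonus = 1 if {0, 1} <= coalition else 0
    return sum(weights[i] for i in coalition) + bonus


exact = exact_shapley(toy_value, 3)
estimate = sampled_shapley(toy_value, 3, num_samples=200)
```

The exact routine evaluates every ordering, which grows factorially with the number of features; the sampled routine trades that cost for variance, which is the trade-off that stability-oriented approximation methods target.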
Evaluating Alternative Training Interventions Using Personalized Computational Models of Learning
MacLellan, Christopher James, Stowers, Kimberly, Brady, Lisa
Evaluating different training interventions to determine which produce the best learning outcomes is one of the main challenges faced by instructional designers. Typically, these designers use A/B experiments to evaluate each intervention; however, it is costly and time-consuming to run such studies. To address this issue, we explore how computational models of learning might support designers in reasoning causally about alternative interventions within a fractions tutor. We present an approach for automatically tuning models to specific individuals and show that personalized models make better predictions of students' behavior than generic ones. Next, we conduct simulations to generate counterfactual predictions of performance and learning for two students (high and low performing) in different versions of the fractions tutor. Our approach makes predictions that align with previous human findings, as well as testable predictions that might be evaluated with future human experiments.
Social and Ethical Risks Posed by General-Purpose LLMs for Settling Newcomers in Canada
Nejadgholi, Isar, Molamohammadi, Maryam, Bakhtawar, Samir
The non-profit settlement sector in Canada supports newcomers in achieving successful integration. This sector faces increasing operational pressures amidst rising immigration targets, which highlights a need for enhanced efficiency and innovation, potentially through reliable AI solutions. The ad-hoc use of general-purpose generative AI, such as ChatGPT, might become a common practice among newcomers and service providers to address this need. However, these tools are not tailored for the settlement domain and can have detrimental implications for immigrants and refugees. We explore the risks that these tools might pose to newcomers in order to, first, warn against the unguarded use of generative AI and, second, incentivize further research and development in creating AI literacy programs as well as customized LLMs that are aligned with the preferences of the impacted communities. Crucially, such technologies should be designed to integrate seamlessly into the existing workflow of the settlement sector, ensuring human oversight, trustworthiness, and accountability.
Application of Artificial Intelligence in Supporting Healthcare Professionals and Caregivers in Treatment of Autistic Children
Rouzbahani, Hossein Mohammadi, Karimipour, Hadis
[...] neurodevelopmental condition marked by difficulties in social interaction, communication impediments, and repetitive behaviors. [...] continue to pose significant challenges due to the variability in symptomatology and the necessity for multidisciplinary care. This paper investigates the potential of Artificial Intelligence (AI) to augment the capabilities of healthcare professionals and caregivers in managing ASD. We have developed a sophisticated algorithm designed to analyze facial and bodily expressions during daily activities of both autistic and non-autistic children, leading to the development of a powerful deep learning-based autism detection system. Our study demonstrated that AI models, specifically the Xception and ResNet50V2 architectures, achieved high accuracy in diagnosing Autism Spectrum Disorder [...] Treatment plans often involve multiple sessions with different therapists, and the absence of a standardized [...] This fragmented approach can impede effective communication and coordination among healthcare providers, adversely affecting the quality of care. Furthermore, parents and caregivers may find it challenging to access and manage the extensive records necessary for consistent treatment, further complicating the overall management of ASD. Artificial Intelligence (AI) presents a promising solution to the complexities involved in diagnosing and treating Autism Spectrum Disorder (ASD) [6]. AI-powered tools have the potential to standardize the diagnostic process by analyzing extensive datasets to uncover patterns and correlations that might be overlooked by human evaluators.
BPO: Supercharging Online Preference Learning by Adhering to the Proximity of Behavior LLM
Xu, Wenda, Li, Jiachen, Wang, William Yang, Li, Lei
Direct alignment from preferences (DAP) has emerged as a promising paradigm for aligning large language models (LLMs) to human desiderata from pre-collected, offline preference datasets. While recent studies indicate that existing offline DAP methods can directly benefit from online training samples, we highlight the need to develop specific online DAP algorithms to fully harness the power of online training. Specifically, we identify that the learned LLM should adhere to the proximity of the behavior LLM, which collects the training samples. To this end, we propose online Preference Optimization in proximity to the Behavior LLM (BPO), emphasizing the importance of constructing a proper trust region for LLM alignment. We conduct extensive experiments to validate the effectiveness and applicability of our approach by integrating it with various DAP methods, resulting in significant performance improvements across a wide range of tasks when training with the same amount of preference data. Even when only introducing one additional data collection phase, our online BPO improves its offline DAP baseline from 72.0% to 80.2% on TL;DR and from 82.2% to 89.1% on Anthropic Helpfulness in terms of win rate against human reference text.
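To make the trust-region idea concrete, here is a minimal sketch of a DPO-style preference loss for a single preference pair. It assumes DPO as the underlying DAP method and assumes that the reference log-probabilities come from the behavior LLM that collected the samples. Both are illustrative assumptions; the abstract does not give the loss, and the function and argument names are hypothetical.

```python
import math


def dpo_style_loss(logp_chosen, logp_rejected,
                   behavior_logp_chosen, behavior_logp_rejected,
                   beta=0.1):
    """DPO-style loss for one preference pair.

    logp_*: sequence log-probabilities under the model being trained.
    behavior_logp_*: the same quantities under the behavior (reference)
    model; using the behavior LLM here is an assumption of this sketch.
    beta: strength of the pull toward the reference model.
    """
    margin = beta * (
        (logp_chosen - behavior_logp_chosen)
        - (logp_rejected - behavior_logp_rejected)
    )
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# When the trained model equals the reference, the loss is log(2).
loss_at_reference = dpo_style_loss(-1.0, -2.0, -1.0, -2.0)

# Raising the chosen response's likelihood relative to the reference lowers it.
loss_improved = dpo_style_loss(-0.5, -3.0, -1.0, -2.0)
```

Because the loss depends only on log-probability ratios against the reference, the choice of reference model determines which region of policy space the optimization is anchored to.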
Binder: Hierarchical Concept Representation through Order Embedding of Binary Vectors
Gyurek, Croix, Talukder, Niloy, Hasan, Mohammad Al
For natural language understanding and generation, embedding concepts using an order-based representation is an essential task. Unlike traditional point-vector-based representation, an order-based representation imposes geometric constraints on the representation vectors for explicitly capturing various semantic relationships that may exist between a pair of concepts. In the existing literature, several approaches to order-based embedding have been proposed, mostly focusing on capturing hierarchical relationships; examples include vectors in Euclidean space, complex, Hyperbolic, order, and Box Embedding. Box embedding creates a rich region-based representation of concepts, but in the process it sacrifices simplicity, requiring a custom-made optimization scheme for learning the representation. Hyperbolic embedding improves embedding quality by exploiting the ever-expanding property of Hyperbolic space, but it suffers the same fate as box embedding, as gradient-descent-like optimization is not simple in Hyperbolic space. In this work, we propose Binder, a novel approach for order-based representation. Binder uses binary vectors for embedding, so the embedding vectors are compact with an order of magnitude smaller footprint than other methods. Binder uses a simple and efficient optimization scheme for learning representation vectors with a linear time complexity. Our comprehensive experimental results show that Binder is very accurate, yielding competitive results on the representation task. But Binder stands out from its competitors on the transitive closure link prediction task as it can learn concept embeddings just from the direct edges, whereas all existing order-based approaches rely on the indirect edges.
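The abstract does not state the ordering constraint on the binary vectors. One natural reading, assumed here purely for illustration, is bitwise implication: concept a is-a concept b when every bit set in b is also set in a. The sketch below encodes vectors as Python integers; the example concepts and bit patterns are made up.

```python
def is_a(child_bits, parent_bits):
    """True if every bit set in parent_bits is also set in child_bits.

    The direction of the implication is an assumption of this sketch.
    """
    return (child_bits & parent_bits) == parent_bits


# Hypothetical 3-bit embeddings: more specific concepts set more bits.
animal = 0b001
dog = 0b011
puppy = 0b111
cat = 0b101

direct_edges_hold = is_a(dog, animal) and is_a(puppy, dog)
# The indirect edge follows from the two direct ones, because bitwise
# implication is transitive.
indirect_edge_holds = is_a(puppy, animal)
unrelated = is_a(cat, dog)
```

Under this reading, satisfying the direct edges automatically satisfies their transitive closure, which is consistent with the abstract's claim that embeddings can be learned from direct edges alone.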
PATCH -- Psychometrics-AssisTed benCHmarking of Large Language Models: A Case Study of Mathematics Proficiency
Fang, Qixiang, Oberski, Daniel L., Nguyen, Dong
Many existing benchmarks of large (multimodal) language models (LLMs) focus on measuring LLMs' academic proficiency, often with an additional interest in comparing model performance with that of human test takers. While these benchmarks have proven key to the development of LLMs, they suffer from several limitations, including questionable measurement quality (e.g., Do they measure what they are supposed to in a reliable way?), lack of quality assessment on the item level (e.g., Are some items more important or difficult than others?) and unclear human population reference (e.g., To whom can the model be compared?). In response to these challenges, we propose leveraging knowledge from psychometrics - a field dedicated to the measurement of latent variables like academic proficiency - into LLM benchmarking. We make three primary contributions. First, we introduce PATCH: a novel framework for Psychometrics-AssisTed benCHmarking of LLMs. PATCH addresses the aforementioned limitations, presenting a new direction for LLM benchmark research. Second, we implement PATCH by measuring GPT-4 and Gemini-Pro-Vision's proficiency in 8th grade mathematics against 56 human populations. We show that adopting a psychometrics-based approach yields evaluation outcomes that diverge from those based on existing benchmarking practices. Third, we release 4 datasets to support measuring and comparing LLM proficiency in grade school mathematics and science against human populations.
"This is not a data problem": Algorithms and Power in Public Higher Education in Canada
Algorithmic decision-making is increasingly being adopted across public higher education. The expansion of data-driven practices by post-secondary institutions has occurred in parallel with the adoption of New Public Management approaches by neoliberal administrations. In this study, we conduct a qualitative analysis of an in-depth ethnographic case study of data and algorithms in use at a public college in Ontario, Canada. We identify the data, algorithms, and outcomes in use at the college. We assess how the college's processes and relationships support those outcomes and the different stakeholders' perceptions of the college's data-driven systems. In addition, we find that the growing reliance on algorithmic decisions leads to increased student surveillance, exacerbation of existing inequities, and the automation of the faculty-student relationship. Finally, we identify a cycle of increased institutional power perpetuated by algorithmic decision-making, and driven by a push towards financial sustainability.